
    A general statistical framework for dissecting parent-of-origin effects underlying endosperm traits in flowering plants

    Genomic imprinting is thought to play an important role in seed development in flowering plants. The seed of a flowering plant normally contains a diploid embryo and a triploid endosperm. Empirical studies have shown that some economically important endosperm traits are genetically controlled by imprinted genes. However, the exact number and location of the imprinted genes are largely unknown because efficient statistical mapping methods are lacking. Here we propose a general statistical variance components framework that uses the natural information of sex-specific allelic sharing among sibpairs in line crosses to map imprinted quantitative trait loci (iQTL) underlying endosperm traits. We propose a new variance components partition method that accounts for the unique characteristics of the triploid endosperm genome, and develop a restricted maximum likelihood estimation method in an interval scan for estimating and testing genome-wide iQTL effects. The cytoplasmic maternal effect, which is thought to have a primary influence on yield and grain quality, is also considered when testing for genomic imprinting. An extension to multiple iQTL analysis is proposed. The asymptotic distribution of the likelihood ratio test for the variance components under irregular conditions is studied. Both a simulation study and a real data analysis indicate good performance and power of the developed approach. Comment: Published at http://dx.doi.org/10.1214/09-AOAS323 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A multilocus likelihood approach to joint modeling of linkage, parental diplotype and gene order in a full-sib family

    BACKGROUND: Unlike a pedigree initiated with two inbred lines, a full-sib family derived from two outbred parents frequently has markers of many different segregation types whose linkage phases are not known prior to linkage analysis. RESULTS: We formulate a general model for simultaneously estimating linkage, parental diplotype and gene order through multi-point analysis in a full-sib family. Our model is based on a multinomial mixture model that takes into account different diplotypes and gene orders, weighted by their corresponding probabilities of occurrence. The EM algorithm is implemented to provide the maximum likelihood estimates of the linkage, parental diplotype and gene order over any type of markers. CONCLUSIONS: Through simulation studies, this model is found to be more computationally efficient than existing models for linkage mapping. We discuss the extension of the model and its implications for genome mapping in outcrossing species.

    Empirical Likelihood Ratio Tests for Coefficients in High Dimensional Heteroscedastic Linear Models

    This paper considers hypothesis testing problems for a low-dimensional coefficient vector in a high-dimensional linear model with heteroscedastic variance. Heteroscedasticity is commonly observed in many applications, including finance and genomic studies. Several statistical inference procedures have been proposed for low-dimensional coefficients in a high-dimensional linear model with homoscedastic variance, but they are not applicable to models with heteroscedastic variance. The heteroscedasticity issue has rarely been investigated. We propose a simple inference procedure based on empirical likelihood to overcome the heteroscedasticity issue. The proposed method is able to make valid inference even when the conditional variance of the random error is an unknown function of high-dimensional predictors. We apply our inference procedure to three recently proposed estimating equations and establish the asymptotic distributions of the proposed methods. Simulation studies and real data applications are conducted to demonstrate the proposed methods.

    Unified empirical likelihood ratio tests for functional concurrent linear models and the phase transition from sparse to dense functional data

    We consider the problem of testing functional constraints in a class of functional concurrent linear models where both the predictors and the response are functional data measured at discrete time points. We propose test procedures based on the empirical likelihood with bias-corrected estimating equations to conduct both pointwise and simultaneous inferences. The asymptotic distributions of the test statistics are derived under the null and local alternative hypotheses, where sparse and dense functional data are considered in a unified framework. We find a phase transition in the asymptotic null distributions and the orders of detectable alternatives from sparse to dense functional data. Specifically, the tests proposed can detect alternatives of √n-order when the number of repeated measurements per curve is of an order larger than $n^{\eta_0}$, with n being the number of curves. The transition points $\eta_0$ for pointwise and simultaneous tests are different and both are smaller than the transition point in the estimation problem. Simulation studies and real data analyses are conducted to demonstrate the methods proposed.

    Mapping Haplotype-haplotype Interactions with Adaptive LASSO

    BACKGROUND: The genetic etiology of complex diseases in humans has commonly been viewed as a complex process involving both genetic and environmental factors functioning in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods for modeling interactions underlying complex diseases between single genetic variants (e.g. single nucleotide polymorphisms or SNPs) have been extensively studied. Recently, haplotype-based analysis has gained popularity among genetic association studies. When multiple sequence or haplotype interactions are involved in determining an individual's susceptibility to a disease, statistical modeling and testing of the interaction effects present daunting challenges, largely due to the complicated higher order epistatic complexity. RESULTS: In this article, we propose a new strategy for modeling haplotype-haplotype interactions under the penalized logistic regression framework with an adaptive L1-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L1-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method which estimates and selects parameters by the modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has a low false positive rate and reasonable power in detecting haplotype interactions. The method is applied to test haplotype interactions between mother and offspring genomes in a small for gestational age (SGA) neonates data set, and significant interactions between different genomes are detected. CONCLUSIONS: As demonstrated by the simulation studies and real data analysis, the approach developed provides an efficient tool for the modeling and testing of haplotype interactions. The implementation of the method in R code can be freely downloaded from http://www.stt.msu.edu/~cui/software.html.

    The effect of multiple genetic variants in predicting the risk of type 2 diabetes

    While recently performed genome-wide association studies have advanced the identification of genetic variants predisposing to type 2 diabetes (T2D), the potential application of these novel findings to disease prediction and prevention has not been well studied. Diabetes prediction and prevention have become urgent issues owing to the rapidly increasing prevalence of diabetes and its associated mortality, morbidity, and health care cost. New prediction approaches using genetic markers could facilitate early identification of high-risk subgroups of the population so that appropriate prevention methods could be effectively applied to delay, or even prevent, disease onset.